import pandas as pd
import numpy as np
from lets_plot import *
LetsPlot.setup_html(isolated_frame=True)
For Project 1, the answer to each question should include a chart and a written response. The year labels on your charts should not include a comma. At least two of your charts must include reference marks.
# Learn more about Code Cells: https://quarto.org/docs/reference/cells/cells-jupyter.html
# Include and execute your code here
df = pd.read_csv("https://github.com/byuidatascience/data4names/raw/master/data-raw/names_year/names_year.csv")
How does your name at your birth year compare to its use historically?
This section compares the use of the name “jai” in my birth year to its historical usage. The chart is empty because the query for “jai” returns no rows from the dataset, which suggests the name is exceptionally rare in the United States. The query matches the lowercase spelling exactly, so a capitalized “Jai” entry, if one exists, would not appear.
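Because the other names queried in this project are capitalized ('Brittany', 'Mary', 'Peter'), one way to rule out a spelling mismatch is to compare names case-insensitively. The sketch below uses a small made-up table with the same column names (`name`, `year`, `Total`); the rows are hypothetical, and whether the real file contains a "Jai" entry is not confirmed here.

```python
import pandas as pd

# Hypothetical rows that mimic the name/year/Total columns used in this project.
sample = pd.DataFrame({
    "name": ["Jai", "Jai", "Mary"],
    "year": [2000, 2001, 2000],
    "Total": [12.0, 15.0, 3400.0],
})

# Exact, case-sensitive match: finds nothing for the lowercase spelling.
exact = sample.query("name == 'jai'")

# Case-insensitive match: lower-case the column before comparing.
relaxed = sample[sample["name"].str.lower() == "jai"]

print(len(exact), len(relaxed))
```

If the relaxed filter also returns nothing on the real data, the empty chart reflects a true absence and not a capitalization issue.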
# label: jai Name Graph
# code-summary: Compare the use of "jai" over time
# fig-cap: "Historical usage of the name 'jai' over time (no data available)"
# fig-align: center
name_jai = df.query("name == 'jai'")
chart_jai = (ggplot(name_jai, aes('year', 'Total')) +
    geom_line(color='blue') +
    ggtitle("Historical Usage of the Name 'jai'") +
    xlab('Year') +
    ylab('Count') +
    # Integer format keeps year labels free of thousands separators
    scale_x_continuous(format='d') +
    geom_hline(yintercept=0, linetype='dotted', color='red') +
    theme(axis_text_x=element_text(angle=45, hjust=1)))
chart_jai.show()
If you talked to someone named Brittany on the phone, what is your guess of his or her age? What ages would you not guess?
Brittany was very popular in the late 1980s and 1990s, so I would guess the person is in their 30s. I would not guess 60 or older, since Brittany was rarely used that long ago.
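The age guess is simple arithmetic: take the year in which the name peaked and subtract it from the current year. The sketch below illustrates this on made-up counts; the numbers, the helper `likely_age`, and the `as_of_year` value are all assumptions for illustration, not values from names_year.csv.

```python
import pandas as pd

# Hypothetical counts, not taken from names_year.csv.
sample = pd.DataFrame({
    "year": [1980, 1985, 1990, 1995, 2000],
    "Total": [1500.0, 14000.0, 32000.0, 8000.0, 2700.0],
})

def likely_age(counts: pd.DataFrame, as_of_year: int) -> int:
    """Age in as_of_year of someone born in the year with the highest count."""
    peak_year = int(counts.loc[counts["Total"].idxmax(), "year"])
    return as_of_year - peak_year

print(likely_age(sample, as_of_year=2025))
```

Applied to the real data, the same helper could be called as `likely_age(df.query("name == 'Brittany'"), as_of_year=...)` with whatever year the guess is being made in.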
# label: Brittany Graph
# code-summary: Read and format data
# fig-cap: "Name popularity for 'Brittany' with reference line"
# fig-align: center
# Include and execute your code here
brittany_df = df.query("name == 'Brittany'")
brittany_chart = (
    ggplot(brittany_df, aes('year', 'Total')) +
    geom_line(color='blue') +
    ggtitle("Brittany Graph") +
    # Reference line marking 1990
    geom_vline(xintercept=1990, linetype='dashed', color='red') +
    # Integer format keeps year labels free of thousands separators
    scale_x_continuous(format='d') +
    xlab("Year") +
    ylab("Total Count") +
    theme(axis_text_x=element_text(angle=45, hjust=1))
)
brittany_chart.show()
Mary, Martha, Peter, and Paul are all Christian names. From 1920 - 2000, compare the name usage of each of the four names in a single chart. What trends do you notice?
This section compares the usage of “Mary,” “Martha,” “Peter,” and “Paul” between 1920 and 2000. Mary was extremely popular early (1920–1960), then declined. Martha had moderate usage, peaking mid-century, then tapered off. Peter and Paul show steadier usage but also decline after 1970.
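Statements such as "peaking mid-century" can be checked numerically by finding each name's peak year. The sketch below does this with a group-wise `idxmax` on a small made-up table; the rows are hypothetical and only mirror the `name`, `year`, and `Total` columns used above.

```python
import pandas as pd

# Hypothetical rows for illustration only.
sample = pd.DataFrame({
    "name": ["Mary", "Mary", "Mary", "Paul", "Paul", "Paul"],
    "year": [1920, 1950, 1980, 1920, 1950, 1980],
    "Total": [70000.0, 65000.0, 11000.0, 14000.0, 26000.0, 9000.0],
})

# Row index of the maximum Total within each name, then look up those rows.
peak_rows = sample.loc[sample.groupby("name")["Total"].idxmax()]

# Map each name to the year of its highest count.
peaks = peak_rows.set_index("name")["year"].to_dict()
print(peaks)
```

Running the same two lines on `biblical_data` would give one peak year per name to cite alongside the chart.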
# label: Biblical Names Graph
# code-summary: Read and format data
# fig-align: center
# Include and execute your code here
biblical_names = ['Mary', 'Martha', 'Peter', 'Paul']
biblical_data = df.query("name in @biblical_names and 1920 <= year <= 2000")
chart_biblical = (ggplot(biblical_data, aes('year', 'Total', color='name')) +
    geom_line() +
    ggtitle("Trends of Biblical Names (1920-2000)") +
    # Integer format keeps year labels free of thousands separators
    scale_x_continuous(format='d') +
    xlab('Year') +
    ylab('Count') +
    theme(axis_text_x=element_text(angle=45, hjust=1)))
chart_biblical.show()
Think of a unique name from a famous movie. Plot the usage of that name and see how changes line up with the movie release. Does it look like the movie had an effect on usage?
The name “Peter” is analyzed to see whether its popularity changed around the Spider-Man films, whose main character is Peter Parker; the chart marks 2002, 2012, and 2017. No strong effect is visible: usage trends steadily downward across all three dates.
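One rough way to put a number on "no strong effect" is to compare average usage in the few years before a release with the few years after it. The sketch below is a minimal version of that idea on made-up counts; the helper `mean_change`, the three-year window, and the sample values are assumptions, not results from the dataset.

```python
import pandas as pd

# Hypothetical, steadily declining counts for illustration only.
sample = pd.DataFrame({
    "year": [1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005],
    "Total": [4000.0, 3900.0, 3800.0, 3700.0, 3600.0, 3500.0, 3400.0, 3300.0],
})

def mean_change(counts: pd.DataFrame, release_year: int, window: int = 3) -> float:
    """Mean Total in the `window` years after release minus the mean before it."""
    years = counts["year"]
    before = counts[(years >= release_year - window) & (years < release_year)]
    after = counts[(years > release_year) & (years <= release_year + window)]
    return after["Total"].mean() - before["Total"].mean()

print(mean_change(sample, release_year=2002))
```

A negative value alone does not separate a movie effect from a decline that was already under way, so the number is best read next to the line chart.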
# label: Movie Name Chart
# code-summary: Analyze and plot trends for the name "Peter"
# fig-cap: "Trends for the name 'Peter' over time, including Spider-Man movie release years"
# fig-align: center
# Include and execute your code here
name_peter = df.query("name == 'Peter'")
chart_peter = (ggplot(name_peter, aes('year', 'Total')) +
    geom_line(color='green') +
    # Reference lines at the marked Spider-Man release years
    geom_vline(xintercept=2002, linetype='dotted', color='red') +
    geom_vline(xintercept=2012, linetype='dotted', color='blue') +
    geom_vline(xintercept=2017, linetype='dotted', color='orange') +
    # Integer format keeps year labels free of thousands separators
    scale_x_continuous(format='d') +
    ggtitle("Trends for the Name 'Peter' with Movie Releases") +
    xlab('Year') +
    ylab('Count') +
    theme(axis_text_x=element_text(angle=45, hjust=1),
          axis_text_y=element_text(size=8)))
chart_peter.show()
Reproduce the chart Elliot using the data from the names_year.csv file.
The name “Elliot” is charted in the same way as the previous names. The chart highlights key milestones, starting with the 1982 release of the movie E.T., that may have influenced the name's popularity. The recreation restricts the data to 1950 onward and draws dashed vertical reference lines at 1982, 1988, and 2002.
# label: Elliot Chart with Milestones
# code-summary: Analyze the trends for 'Elliot' and highlight key release events
# fig-cap: "Trends for the name 'Elliot' with key movie release milestones"
# fig-align: center
# Create the line chart for the name "Elliot"
name_elliot = df.query("name == 'Elliot' and year >= 1950")
chart_elliot = (ggplot(name_elliot, aes('year', 'Total')) +
    geom_line(color='orange') +
    # Dashed reference lines at the milestone years
    geom_vline(xintercept=1982, linetype='dashed', color='red') +
    geom_vline(xintercept=1988, linetype='dashed', color='red') +
    geom_vline(xintercept=2002, linetype='dashed', color='red') +
    # Integer format keeps year labels free of thousands separators
    scale_x_continuous(format='d') +
    ggtitle("Trends for the Name 'Elliot' with Movie Milestones") +
    xlab('Year') +
    ylab('Count') +
    theme(axis_text_x=element_text(angle=45, hjust=1),
          axis_text_y=element_text(size=8)))
chart_elliot.show()